Added an API version of inference.py #611
Open
As W2L is quite old and its dependencies cannot easily be ported to Python 3.11, I wrote a small wrapper that starts a Flask server serving W2L.
Additionally, since the application doesn't shut down after each call, the checkpoint isn't unloaded and is only swapped out if a different checkpoint path is passed. This results in much better performance on subsequent calls (the first call is of course still pretty slow if you use a checkpoint other than the default one).
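For reference, the caching behaviour could look roughly like this. This is a minimal sketch, assuming the `load_model` helper from inference.py; the route name and payload keys are illustrative, not necessarily what server.py actually uses:

```python
from flask import Flask, request, jsonify

from inference import load_model  # helper defined in inference.py

app = Flask(__name__)

# Keep the model in memory between requests; remember which
# checkpoint it came from so it is only reloaded on a path change.
_model = None
_checkpoint_path = None

def get_model(checkpoint_path):
    """Return the cached model, reloading only if the path changed."""
    global _model, _checkpoint_path
    if _model is None or checkpoint_path != _checkpoint_path:
        _model = load_model(checkpoint_path)
        _checkpoint_path = checkpoint_path
    return _model

@app.route("/inference", methods=["POST"])  # route name is illustrative
def run_inference():
    args = request.get_json()
    model = get_model(args["checkpoint_path"])
    # ... run the Wav2Lip pipeline from inference.py with `model` ...
    return jsonify({"status": "ok"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```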
The code in server.py is essentially the same as in inference.py. The JSON payload is minimal, though, and doesn't accept all parameters (yet?), only four of them:
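For example, a call might look like this. The four field names below are an assumption mirroring inference.py's main CLI arguments; server.py defines the actual payload, and the endpoint path is illustrative (Flask serves on port 5000 by default):

```python
import requests

# Hypothetical payload; field names mirror inference.py's main
# CLI arguments, server.py defines the real ones.
payload = {
    "checkpoint_path": "checkpoints/wav2lip_gan.pth",
    "face": "input/face.mp4",
    "audio": "input/speech.wav",
    "outfile": "results/result.mp4",
}

resp = requests.post("http://127.0.0.1:5000/inference", json=payload)
print(resp.json())
```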
The only additional requirement is Flask (`pip install flask`).